What’s the law around aggregating news online? A Harvard Law report on the risks and the best practices

By Kimberley Isbell @kisbell Sept. 8, 2010, 10:30 a.m.

[So much of the web is built around aggregation — gathering together interesting and useful things from around the Internet and presenting them in new ways to an audience. It’s the foundation of blogging and social media. But it’s also the subject of much legal debate, particularly among the news organizations whose material is often what’s being gathered and presented. Kimberley Isbell of our friends the Citizen Media Law Project has assembled a terrific white paper on the current state of the law surrounding aggregation — what courts have approved, what they haven’t, and where the (many) grey areas still remain. This should be required reading for anyone interested in where aggregation and linking are headed.

You can get the full version of the paper (with footnotes) here; I’ve added some links for context. This work was derived in part from CMLP’s spring conference, where I was lucky enough to speak; we’ll be presenting some more content from that conference in the coming days. And finally, if this kind of subject interests you, you should think about attending CMLP’s upcoming conference in Atlanta on “Media Law in the Digital Age,” done in conjunction with our other friends at the Center for Sustainable Journalism. —Josh]

During the past decade, the Internet has become an important news source for most Americans. According to a study conducted by the Pew Internet and American Life Project, as of January 2010, nearly 61 percent of Americans got at least some of their news online in a typical day. This increased reliance on the Internet as a source of news has coincided with declining profits in the traditional media and the shuttering of newsrooms in communities across the country. Some commentators look at this confluence of events and assert that, in this case, correlation equals causation — the Internet is harming the news business.

One explanation for the decline of the traditional media that some, including News Corporation owner Rupert Murdoch and Associated Press Chairman Dean Singleton, have seized upon is the rise of the news aggregator. According to this theory, news aggregators from Google News to The Huffington Post are free-riding, reselling and profiting from the factual information gathered by traditional media organizations at great cost. Rupert Murdoch has gone so far as to call Google’s aggregation and display of newspaper headlines and ledes “theft.” As the traditional media are quick to point out, the legality of a business model built around the monetization of third-party content isn’t merely an academic question — it’s big business. Revenues generated from online advertising totaled $23.4 billion in 2008 alone.

Building a business model around monetizing another website’s content isn’t novel, and methods for doing so have been around for almost as long as the Internet has been a commercial platform. Consider the practice of framing, or superimposing ads onto embedded websites. There’s also inline linking, or incorporating content from multiple websites into a single third-party site. These days, it’s news aggregators that are drawing a lot of scrutiny. But are they legal?

What is a news aggregator?

Before tackling the legal questions implicated by news aggregators, we should first define the term. At its most basic, a news aggregator is a website that takes information from multiple sources and displays it in a single place. While the concept is simple in theory, in practice news aggregators take many forms. For this reason, any attempt to talk about the legal issues surrounding “news aggregation” is bound to fail, unless we take into consideration the relevant differences among the various models. For the purposes of our discussion, we will group news aggregators into four categories: Feed Aggregators, Specialty Aggregators, User-Curated Aggregators, and Blog Aggregators.

Feed Aggregators: As used in this discussion, a “Feed Aggregator” is closest to the traditional conception of a news aggregator, namely, a website that contains material from a number of websites organized into various “feeds,” typically arranged by source, topic, or story. Feed Aggregators often draw their material from a particular type of source, such as news websites or blogs, although some Feed Aggregators will contain content from more than one type of source. Some well-known examples are Yahoo! News (and its sister site, My Yahoo!) and Google News. Feed Aggregators generally display the headline of a story, and sometimes the first few lines of the story’s lede, with a link to where the rest of the story appears on the original website. The name of the originating website is often listed, as well.

Specialty Aggregators: For the purposes of this white paper, a “Specialty Aggregator” is a website that collects information from a number of sources on a particular topic or location. Examples of Specialty Aggregators are hyper‐local websites like Everyblock and Outside.In and websites that aggregate information about a particular topic like Techmeme and Taegan Goddard’s Political Wire. Like Feed Aggregators, Specialty Aggregators typically display the headline of a story, and occasionally the first few lines of the lede with a link to the rest of the story, along with the name of the website on which the story originally appeared. Unlike Feed Aggregators, which cover many topics, Specialty Aggregators are more limited in focus and typically cover just a few topics or sources.

User‐Curated Aggregators: A “User‐Curated Aggregator” is a website that features user‐submitted links and portions of text taken from a variety of websites. Often, the links on a User‐Curated Aggregator will be culled from a wider variety of sources than most news aggregators, and will often include links to blog posts and multimedia content like YouTube videos, as well as links to more traditional media sources.

Blog Aggregators: Of the four types of news aggregators discussed in this paper, the final category, what we’re calling “Blog Aggregators,” looks the least like a traditional news aggregator. Blog Aggregators are websites that use third‐party content to create a blog about a given topic. The Gawker Media sites are perhaps among the best‐known examples of Blog Aggregators, and also illustrate the different forms that the use of third‐party content can take on these sites. One method of using third‐party content on Blog Aggregators is as raw material for blogger‐written content, synthesizing information from a number of sources into a single story (occasionally, but not always, incorporating quotes from the original articles) and linking to the original content in the article, at the end, or both. Elsewhere, a post may consist of a two‐ to three‐sentence summary of an article from a third‐party source, with a link to the original article. Yet other posts are composed of short excerpts or summaries from a number of articles strung together, all with links back to the original articles. Another popular Blog Aggregator is the Huffington Post, which likewise uses third‐party content in a number of different ways. The Huffington Post website is organized into several sections, the front pages of which typically feature links to a mixture of different types of content, including original articles authored by Huffington Post writers, AP articles hosted on the Huffington Post website, and articles hosted on third‐party websites. In linking to content on third‐party websites, the Huffington Post sometimes uses the original headline, and other times uses a headline written by Huffington Post editors.

Can they do that?

For all of the attention that news aggregators have received, no case in the United States has yet definitively addressed the question of whether their activities are legal. Only a small number of lawsuits have been brought against news aggregators, and all of them have settled before a final decision on the merits.

Before trying to answer the question of the legality of news aggregators under U.S. law, let’s take a closer look at the cases that have been brought to see what arguments both sides of the debate are making.

AFP v. Google News

While still a young company relying on private capital, Google launched a news aggregator in 2002 that was intended as a companion to its increasingly popular search engine. Using Google’s Internet search prowess to crawl through thousands of online media sources, Google News, as the service would be called, featured various news stories published over the past 30 days. At the time AFP filed suit, Google News displayed the headline, lede, and accompanying photo of articles published by the different news providers accessed by Google’s news crawler. Google also provided a link to the original story as it appeared on the website from which the story was accessed.

Many of the articles that appeared in Google News were written by wire services such as Agence France Presse (“AFP”) and The Associated Press, but displayed on third-party websites. Wire services like the AFP generally do not distribute news freely on their own websites as do many newspapers; instead, they license their content to other news providers, such as local newspapers. According to AFP, then, the headline, lede and photo displayed by Google News were licensed content, and the only parties that were authorized to publish them were those that paid licensing fees. By providing this content, even in an abbreviated form, AFP claimed, Google News was infringing its copyrights and stealing its product.

AFP filed a lawsuit against Google in federal district court in Washington, D.C. in 2005. The Amended Complaint asserted claims against Google for copyright infringement in AFP’s photos, headlines, and ledes; a claim for removal or alteration of AFP’s copyright management information; and a claim for “hot news” misappropriation. Google responded to AFP’s claims by filing two separate motions to dismiss: the first, based on AFP’s failure to identify with particularity all of those works it alleged Google to have infringed, and the second, a partial motion to dismiss AFP’s claim for copyright infringement of AFP’s headlines, on the grounds that the headlines constituted uncopyrightable subject matter.

After nearly two years of litigation and extensive discovery, AFP and Google settled the case, entering into a licensing deal granting Google the right to post AFP content, including news stories and photographs, on Google News and on other Google services.

Associated Press v. All Headline News

Almost three years later, the Associated Press (“AP”) filed a lawsuit against another news aggregator, All Headline News. On its website, All Headline News described itself as a “global news agency and content service.” According to the AP’s complaint, however, All Headline News “ha[d] no reporters,” and instead prepared its content by having employees “copy[] news stories found on the internet or rewrit[e] such stories.” All Headline News then repackaged and sold this content to clients that included newspapers, Internet web portals, websites, and other redistributors of news content. The AP asserted claims against All Headline News for “hot news” misappropriation, copyright infringement, removal or alteration of copyright management information, trademark infringement, unfair competition, and breach of contract.

All Headline News filed a partial motion to dismiss most of the AP’s claims, except the claim for copyright infringement. Nearly a year later, the Southern District of New York issued an order granting in part and denying in part All Headline News’ motion. The court dismissed the AP’s trademark infringement claims, but retained the remaining claims against All Headline News, including hot news misappropriation. Four months later, the parties settled. Under the settlement agreement, All Headline News agreed to cease using AP content and paid an unspecified sum “to settle the AP’s claim for past unauthorized use of AP expression and news content.”

GateHouse Media v. New York Times Co.

One of the more recent news aggregation cases pitted two traditional media companies against each other. GateHouse Media, which at the time operated more than 375 local newspapers and their respective websites, claimed that The New York Times Co. copied the headlines and ledes from GateHouse’s Wicked Local websites as part of its own local news aggregation effort on the Boston.com website. GateHouse’s Complaint asserted claims against The New York Times Co. for copyright infringement, trademark infringement, false advertising, trademark dilution, unfair competition, and breach of contract (for failure to comply with the provisions of the Creative Commons license under which the Wicked Local content was distributed).

Concurrently with filing the Complaint, GateHouse filed a motion requesting a temporary restraining order and preliminary injunction prohibiting The New York Times Co. from using content from the Wicked Local websites. The court denied GateHouse’s motion for a restraining order and consolidated the motion for a preliminary injunction with an expedited trial on the merits. The parties settled on the eve of trial, with each side agreeing, among other things, to remove the other’s RSS feeds from its websites.

So is it legal?

As the foregoing discussion illustrates, there are two doctrines that need to be considered when attempting to determine whether news aggregation is legal: copyright and hot news misappropriation. We turn to each of these below.

Under U.S. copyright law, a work is protected if it (1) is an original work of authorship, and (2) is fixed in a tangible medium of expression that can be read directly or with the aid of a machine or device (i.e., is recorded or embodied in some manner for more than a transitory duration). With certain exceptions, the owner of a copyrighted work has the right to prohibit others from reproducing, preparing derivative works from, distributing copies of, or publicly performing or displaying the work.

While most news articles meet the second prong of the copyrightability test, this does not end the inquiry. To be protected by copyright, the material copied by the news aggregator also needs to be original (i.e., both independently created by the author and minimally creative). Under U.S. copyright law, ideas and facts cannot be copyrighted, but the way a person expresses those ideas or facts can be. It is also a generally accepted proposition of U.S. copyright law that titles and short phrases are not protected under copyright law.

These last two propositions are cited by many news aggregators to claim that the headlines of news stories (and, less frequently, the ledes) do not qualify for copyright protection, and thus the reproduction of this material on a news aggregator’s website does not constitute copyright infringement. According to this argument, a headline is an uncopyrightable title or short phrase. Moreover, the argument goes, headlines are highly factual and thus the merger doctrine would prohibit copyright protection. The merger doctrine denies protection to certain expressions of an idea (or set of facts) where the idea and its expression are so inseparable that prohibiting third parties from copying the expression would effectively grant the author protection of the underlying idea. In its litigation against AFP, Google asserted a variant of this argument. Noting that AFP’s headlines “often consist of fewer than 10 words,” Google argued that, though they may be “painstakingly created,” they were nonetheless not entitled to copyright protection because they “generally seek to encapsulate the factual content of the story,” and did not contain protectable original expression that was separable from their factual content. While this argument has some appeal when directed at short, highly factual headlines, it becomes a harder argument to make when directed at text from the article, such as the lede. For, as the Supreme Court noted in Feist Publications, Inc. v. Rural Telephone Service Co., Inc., the level of creativity required for a work to be “original” and thus protectable is extremely low — a work satisfies this requirement as long as it possesses some creative spark, “no matter how crude, humble or obvious it might be.”

Fair use

Assuming that headlines and ledes are copyrightable subject matter, a news aggregator’s reproduction of them is not actionable if its use of the material qualifies as a fair use. The Copyright Act sets forth four nonexclusive factors for courts to consider when determining whether a use qualifies as a fair use. These factors include: (1) The purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes; (2) The nature of the copyrighted work; (3) The amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) The effect of the use upon the potential market for or value of the copyrighted work. This section will take each of these factors in turn, and apply them to the four categories of news aggregators previously discussed.

The purpose and character of the use. The first thing courts will consider when evaluating this factor is whether the use is commercial in nature. Because most (but not all) news aggregators contain advertisements, it is likely that a court would find the use to be commercial, cutting against a finding of fair use. The fact that the websites are commercial does not end the inquiry into the first fair use factor, however. In addition to looking at whether the use is commercial in nature, courts also look at whether the use is “transformative” — namely, does the new work merely serve as a replacement for the original work, or does it instead add something new, either by repurposing the content, or infusing the content with a new expression, meaning, or message.

Applying the transformative test to the four categories of news aggregators yields slightly different results.

Applied to Feed Aggregators, the first fair use factor cuts slightly in favor of a finding of fair use because of the transformative nature of the categorization and indexing functions performed by the Feed Aggregators. The Ninth Circuit has repeatedly found that certain reproductions of copyrighted works by a search engine are a “transformative” use. In Kelly v. Arriba Soft Corp., the Ninth Circuit found that the reproduction of thumbnails of plaintiff’s photographs in defendant’s search engine results was transformative, noting that “[the search engine’s] use of the images serves a different function than [plaintiff’s] use — improving access to information on the internet versus artistic expression.” Likewise, in Perfect 10, Inc. v. Amazon.com, Inc., the court noted the significant public benefit provided by Google’s image search “by incorporating an original work into a new work, namely, an electronic reference tool,” and observed that “a search engine may be more transformative than a parody because a search engine provides an entirely new use of the original work, while a parody typically has the same entertainment purpose as the original work.”

But, it is worth noting, the case for transformative use isn’t as strong for a news aggregation site as it was for a pure search engine. While the uses were clearly of a different nature in Kelly and Perfect 10 (artistic/entertainment purposes for the original photographs versus an informational searching and indexing function for the search engine’s reproduction of the images), a Feed Aggregator serves a similar function to a newspaper’s website — to collect and organize news stories so that they can be read by the public. Nonetheless, the Feed Aggregator does provide its user with the convenience of accessing stories from a large number of sources on one web page, categorizing those feeds and permitting searching of the feeds, which is at least minimally transformative.

In many cases, Specialty Aggregators will have an even stronger argument that their use is transformative. Specialty Aggregators have a narrower focus than many of the websites from which they draw material, providing readers with the benefit of collecting all (or most) of the reporting on a particular topic in one place. Specialty Aggregators thus contribute something new and socially useful by providing context and enabling comparisons between sources covering a story that would not otherwise be possible.

Similarly, User-Curated Aggregators can be viewed as somewhat more transformative than Feed Aggregators because users collect the stories. This feature enables the additional function of determining what stories are popular among a certain group of Internet users. User-Curated Aggregators often further the additional purpose of promoting community commentary on the posted stories.

In many cases, Blog Aggregators will have the strongest claim of a transformative use of the material because they often provide additional context or commentary alongside the material they use. Blog Aggregators also often bring to the material a unique editorial voice or topic of focus, further distinguishing the resulting use from the purpose of the original article.

The nature of the copyrighted work. In deciding whether the nature of the copyrighted work favors a finding of fair use, courts look to a number of factors, including, “(1) whether the work is expressive or creative, such as a work of fiction, or more factual, with a greater leeway being allowed to a claim of fair use where the work is factual or informational, and (2) whether the work is published or unpublished, with the scope for fair use involving unpublished works being considerably narrower.” Here, the factual nature of the news articles primarily used by all types of news aggregators weighs slightly in favor of a finding of fair use. The Supreme Court has recognized that “[t]he law generally recognizes a greater need to disseminate factual works than works of fiction or fantasy.” Likewise, the fact that news aggregators are making use of published stories would weigh in favor of a finding of fair use.

The amount and substantiality of the portion used in relation to the copyrighted work as a whole. In evaluating this factor, courts look at the amount of the copyrighted work that is reproduced both quantitatively and qualitatively. Looked at from a quantitative perspective, most news aggregators use only a small portion of the original work — usually just the headline, and sometimes a few sentences from the lede. This would weigh in favor of finding fair use. Many content originators argue, however, that the portion of a story reproduced by news aggregators is much more significant when looked at from a qualitative perspective. This is because, they argue, the headline and lede often contain the most important parts of the story — in other words, they constitute the “heart” of the article. The Supreme Court, as well as a number of lower courts, has found that the reproduction of even a short excerpt can weigh against a finding of fair use if the excerpt reproduces the “heart” of the work. Given the factual nature of this inquiry, it is not possible to say definitively how courts would view all news aggregators. In some instances, the first few sentences may contain the heart of the work. In other instances this will not be the case.

The effect of the use on the potential market for the copyrighted work. This is perhaps the most hotly debated of the four fair use factors when it comes to the practice of news aggregation. Content originators like AFP, the AP, and others would argue that a well-defined market currently exists for the reproduction and syndication of news articles, and that news aggregators’ use of the content without paying a licensing fee directly threatens that market. Likewise, content originators are likely to argue that for many consumers, the use of their content by the news aggregators replaces the need for the original articles. In support of this contention, they can cite to studies like one recently released by the research firm Outsell, which found that 44 percent of Google News users scan the headlines without ever clicking through to the original articles on the newspapers’ websites.

In response, news aggregators like Google News are likely to argue that, despite studies like this, their services are still a net benefit to newspapers by driving traffic to their websites from consumers that would be unlikely to otherwise encounter their content. Further, news aggregators could argue that the type of consumer that would only skim the headlines and ledes on the news aggregators’ website is not the type of consumer that is likely to visit individual news websites and read full articles, and thus would be unlikely to be a source of traffic for the newspapers’ websites if the news aggregators did not exist.

As the foregoing analysis shows, the question of whether news aggregators are making fair use of copyrighted content is a complicated inquiry, the outcome of which heavily depends on the specific facts of each case. Even within the four categories of aggregators discussed here, there is considerable variation in how the fair use factors would likely play out. Websites that reproduce only headlines, and not ledes, are likely to have an easier time making a case for fair use.

Hot news misappropriation

Another theory of liability that has been asserted against news aggregators is hot news misappropriation. The hot news misappropriation doctrine has its origins in a 1918 Supreme Court decision, International News Service v. Associated Press. The case arose from a unique set of circumstances involving two competing newsgathering organizations: the International News Service (“INS”) and the Associated Press (“AP”). Both the INS and AP provided stories on national and international events to local newspapers throughout the country, which subscribed to their wire services and bulletin boards. In this way, papers with subscriptions to either the INS or AP were able to provide their readers with news about far-flung events without undertaking the expense of setting up their own foreign bureaus.

During World War I, however, the two services were not equally well positioned to report on events occurring in the European theater. William Randolph Hearst, the owner of the INS, had been an outspoken critic of Great Britain and the United States’ entry into the war and openly sympathized with the Germans. In retaliation, Great Britain prohibited reporters for the INS from sending cables about the war to the United States, thus hampering INS’s ability to report on war developments. To ensure that its subscribers were still able to carry news about the war, INS engaged in a number of questionable practices, including bribing employees of newspapers that were members of the AP for pre-publication access to the AP’s reporting. At issue before the Supreme Court, however, was INS’s practice of purchasing copies of East Coast newspapers running AP stories about the war, rewriting the stories using the facts gleaned from the AP’s reporting, and sending the stories to INS’s subscribers throughout the United States. In some cases, this practice led to INS subscribers on the West Coast “scooping” the local competitor carrying the original AP story.

In order to prevent this activity, the Supreme Court crafted a new variant of the common law tort of misappropriation, referred to by commentators as the “hot news” doctrine. As set forth in the Court’s opinion, the essence of the tort is that one competitor free rides on another competitor’s work at the precise moment when the party whose work is being misappropriated was expecting to reap rewards for that work. The Court drew upon a view of property and human enterprise theories inspired by John Locke in establishing the common law doctrine of hot news misappropriation: it wanted to reward the AP for the time and expense involved in gathering and disseminating the news. The Court viewed INS’s activities, through which it was able to reap the competitive benefit of the AP’s reporting without expending the time and money to collect the information, as an interference with the normal operation of the AP’s business “precisely at the point where the profit is to be reaped, in order to divert a material portion of the profit from those who have earned it to those who have not.” The Court reasoned that “he who has fairly paid the price should have the beneficial use of the property,” sidestepping arguments that there is no true “property” to be had in the news by relying upon the court’s equitable powers to address unfair competition. The Court affirmed the circuit court’s decision, leaving in place an injunction against INS taking facts from the AP’s stories “until [the facts’] commercial value as news to the complainant and all of its members has passed away.”

The INS case was decided in a unique historical context that in some ways differs from the contemporary competitive landscape. At the time, there were relatively few news services able to undertake the costs and logistical hurdles of reporting on events in the European theater for newspaper readers in the United States. Thus, as a result of the British government’s sanctions against INS, the costs of reporting on the war in Europe fell almost entirely on the AP. This was also the decade in which the number of U.S. daily newspapers peaked. Every major city had multiple daily newspapers, and thirty minutes of lead time for a paper could mean thousands of extra readers that day.

In addition, INS was decided before the advent of modern First Amendment jurisprudence, which can largely be traced to two cases decided by the Supreme Court the following year: Abrams v. United States, 250 U.S. 616 (1919), and Schenck v. United States, 249 U.S. 47 (1919). Accordingly, the majority opinion in INS did not address the First Amendment at all, and Justice Brandeis’s famous dissent, while hinting at the tension between freedom of expression and the theory of hot news misappropriation, likewise failed to consider the First Amendment as an independent limitation on the brand new doctrine.

While the current competitive landscape differs from that of INS’s era, the modern doctrine of hot news misappropriation relies on the same essential theoretical underpinnings as those outlined by the Supreme Court in that case. There is one key difference, however. While the Supreme Court in INS adopted the hot news misappropriation doctrine as federal common law, since INS, recognition of the misappropriation doctrine has shifted to the states. Today, only five states have adopted the INS hot news tort as part of state unfair competition law.

The modern hot news doctrine

The Second Circuit’s decision in NBA v. Motorola typifies the modern application of the hot news misappropriation doctrine and stands as its leading case. In NBA, the National Basketball Association sued Motorola over a pager service by which Motorola provided its customers with scores and other statistics about ongoing NBA basketball games. Motorola paid people to watch or listen to the games and upload game statistics into a data feed, which Motorola sent to its pager customers. The NBA claimed that Motorola’s operation of the pager service constituted a form of misappropriation and sought to enjoin the service.

At the start of its analysis, the Second Circuit Court of Appeals addressed whether or not the 1976 Copyright Act, which provides copyright protection only for original expression, preempted the state-law misappropriation claim. After looking at the legislative history behind the Act and using the “extra-element” test for preemption, the Second Circuit ruled that a narrow version of the hot news misappropriation tort survived the enactment of the 1976 Copyright Act. The NBA court formulated the elements of the surviving hot news tort as follows:

(i) a plaintiff generates or gathers information at a cost; (ii) the information is time-sensitive; (iii) a defendant’s use of the information constitutes free riding on the plaintiff’s efforts; (iv) the defendant is in direct competition with a product or service offered by the plaintiffs; and (v) the ability of other parties to free-ride on the efforts of the plaintiff or others would so reduce the incentive to produce the product or service that its existence or quality would be substantially threatened.

As articulated by the Second Circuit, the modern form of the misappropriation doctrine thus affords plaintiffs some limited copyright-like protection for facts under narrowly defined circumstances. Applying its test to the facts of the case, the Second Circuit found that the NBA failed to make out a hot news claim because operation of Motorola’s pager service did not undermine the NBA’s financial incentive to continue promoting, marketing, and selling professional basketball games. In other words, this was not a situation in which “unlimited free copying would eliminate the incentive to create the facts in the first place.”

The plaintiffs were more successful in a recent case out of the Southern District of New York. In Barclays Capital Inc. v. TheFlyOnTheWall.com, the district court issued a permanent injunction requiring the financial news website TheFlyOnTheWall.com (“Fly”) to delay its reporting of the stock recommendations of research analysts from three prominent Wall Street firms, Barclays Capital Inc., Merrill Lynch, and Morgan Stanley. The injunction, which issued after a finding by the district court that Fly had engaged in hot news misappropriation, requires Fly to wait until 10 a.m. EST before publishing the facts associated with analyst research released before the market opens, and to postpone publication for at least two hours for research issued after the opening bell. Notably, the injunction prohibits Fly from reporting on stock recommendations issued by the three firms even if such recommendations have already been reported in the mainstream press.

The decision is currently on appeal to the Second Circuit Court of Appeals. Like the Supreme Court in INS, however, both the Second Circuit in NBA and the district court in Barclays failed to undertake any analysis of whether the hot news misappropriation doctrine comports with the requirements of the First Amendment. Specifically, as the Supreme Court has recognized on many subsequent occasions, one of the principal aims of the First Amendment is to “secure the ‘widest possible dissemination of information from diverse and antagonistic sources.'” To that end, the Supreme Court has recognized — in cases decided subsequent to INS — that the First Amendment protects truthful reporting on matters of public concern. Since the hot news misappropriation doctrine contemplates restrictions on or liability for the publication of truthful information on matters of public concern, even when lawfully obtained, the doctrine as currently articulated raises First Amendment concerns. It is unclear at this point how a court would ultimately weigh the state interest in assisting news gatherers to reap the benefits of their work against the First Amendment interest in widely disseminating truthful information about matters of public import.

Application of the hot news misappropriation doctrine to news aggregators

Because of the lack of decisions on the merits in recent hot news misappropriation cases, it is difficult to determine how a court would ultimately apply the elements of the tort to news aggregators. Nonetheless, it is worth briefly reviewing the elements.

Plaintiff generates or gathers information at a cost. As to this factor, a plaintiff that undertook original reporting and had some (or all) of the contents of that reporting repurposed by news aggregators would likely be able to satisfy this prong. Unlike the fair use situation, however, Blog Aggregators may be more vulnerable to a hot news claim than Feed Aggregators or Specialty Aggregators, since the former usually incorporate more of the facts from a story in their work. In contrast, Feed Aggregators and Specialty Aggregators usually limit themselves to reproducing the headline and some portion of the lede of the source article, which may or may not contain information that was costly to gather. (User-Curated Aggregators are likely to fall somewhere in the middle, since additional information about the contents of the article will often appear in the comments below the article.)

The information is time-sensitive. This factor, rather than looking at the defendant’s use of the information, looks exclusively to the nature of the plaintiff’s information. Accordingly, application of this factor is unlikely to vary among our four types of news aggregators, and would be determined on a case-by-case basis.

Defendant’s use of the information constitutes free riding on the plaintiff’s efforts. Here, courts would likely look to the nature of the defendant’s use of the information. While plaintiffs are likely to characterize any use of their material without a license as “free riding,” Blog Aggregators that add additional information or context to a story are less likely to be considered free riders than a spam blog or service like All Headline News that merely rewrites and repurposes the plaintiff’s content. Likewise, Feed Aggregators, Specialty Aggregators and User-Curated Aggregators arguably add their own effort by collecting in one location information from many places on the web, making it more accessible to the public, although the Barclays court found that such aggregation activities were insufficient to overcome a finding that defendant’s activities constituted “free riding.”

The defendant is in direct competition with a product or service offered by the plaintiffs. In most of the hot news misappropriation cases decided to date, this has been one of the two most difficult prongs for plaintiffs to successfully establish. It is perhaps more likely that a Feed Aggregator like Google News or Yahoo! News would be found to be a direct competitor of a newspaper website than a Specialty Aggregator, User-Curated Aggregator or Blog Aggregator would be. This is because Feed Aggregators can in some cases serve as a replacement for visiting the website of a newspaper like The New York Times, since they often cover many of the same stories, and the majority of the stories found on the newspapers’ websites are likely to be reproduced on the Feed Aggregator’s website. In contrast, a Specialty Aggregator like TechMeme would contain only a small subset of the articles one would find on the Times’ website, and thus would be likely to serve a different audience. (Of course, TechMeme would likely be considered a direct competitor of a highly specialized publication like Macworld.)

Defendant’s actions would reduce the incentive to produce the information to a point where its existence or quality would be substantially threatened. This has likewise been a difficult prong for plaintiffs to establish in hot news misappropriation cases, and, in fact, formed the basis for the Second Circuit holding in favor of the defendant in NBA. Here, the analysis turns less on the type of aggregator than on the use the aggregator makes of the information. Two factors courts would likely consider important in determining whether a news aggregator engages in hot news misappropriation are (1) the extent to which viewing the information on the news aggregator’s website would replace reading the original content, and (2) the size and nature of the news aggregator’s readership. Thus, a Blog Aggregator that summarizes all of the relevant information from a news article or a Feed Aggregator that reproduces the entire lede of the story is likely to have a greater deleterious effect on the plaintiff’s incentive to invest in news gathering than a Feed Aggregator or Specialty Aggregator that displays only a headline or a few words from the lede. Likewise, a news aggregator with a small readership or a readership that did not significantly overlap with the plaintiff’s core readership would be unlikely to threaten the continued existence of a newspaper, while Google News or a website that targets the same consumers could perhaps be more damaging.

Conclusion

As the foregoing discussion illustrates, there is a good bit of legal uncertainty surrounding news aggregation activities, and it is difficult to provide a definitive answer in a paper like this. Both fair use and hot news misappropriation claims are highly fact specific. There is great variation in the legal analysis between different categories of news aggregators, as well as within the categories. Further, it remains to be seen whether the hot news misappropriation doctrine as currently formulated will remain viable in light of First Amendment concerns. Nonetheless, there are certain steps that news aggregators can take to mitigate their legal risks, as outlined in the “Best Practices” below.

While the authors anticipate that the debate regarding news aggregators will continue to be fought in the courts and in public policy circles, we would like to sound a note of caution for those seeking to “save” journalism by addressing the issue of news aggregation. We are in the midst of a sea change in the way in which journalism is practiced in the United States. The past few years have seen an explosion of innovative approaches to both the practice and business of journalism. At a time of great flux in the media ecosystem, it would be premature, and likely counterproductive, to create rules which would have the effect, if not the purpose, of privileging one journalistic business model over others. In order for experimental business models to flourish, we need legal rules that promote flexibility and free access to information, not closed systems that tilt the playing field in favor of incumbents.

—

Best practices

If you are the creator of a news aggregation website, what should you do to protect yourself against lawsuits? Short of licensing all of the content you use, there are certain best practices that you can adopt that are likely to reduce your legal risk.

1. Reproduce only those portions of the headline or article that are necessary to make your point or to identify the story. Do not reproduce the story in its entirety.

2. Try not to use all, or even the majority, of articles available from a single source. Limit yourself to those articles that are directly relevant to your audience.

3. Prominently identify the source of the article.

4. Whenever possible, link to the original source of the article.

5. When possible, provide context or commentary for the material you use.

Photo by Carlos Lorenzo used under a Creative Commons license.
